Fitting semiparametric random effects models to large data sets.

Authors

  • Michael L Pennell
  • David B Dunson
Abstract

For large data sets, it can be difficult or impossible to fit models with random effects using standard algorithms due to memory limitations or high computational burdens. In addition, it would be advantageous to use the abundant information to relax assumptions, such as normality of random effects. Motivated by data from an epidemiologic study of childhood growth, we propose a 2-stage method for fitting semiparametric random effects models to longitudinal data with many subjects. In the first stage, we use a multivariate clustering method to identify G<<N groups of subjects whose data have no scientifically important differences, as defined by subject matter experts. Then, in stage 2, group-specific random effects are assumed to come from an unknown distribution, which is assigned a Dirichlet process prior, further clustering the groups from stage 1. We use our approach to model the effects of maternal smoking during pregnancy on growth in 17,518 girls.
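The two stages described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation. It assumes each subject has already been reduced to a summary vector (for example, a fitted intercept and slope), replaces the multivariate clustering step with a simple grid rule in which subjects sharing a cell of width `tol` are treated as having no important difference, and only draws from a truncated stick-breaking representation of a Dirichlet process prior rather than performing posterior inference. The names `stage1_grid_groups`, `stick_breaking_weights`, `tol`, `alpha`, and `truncation` are hypothetical, and the data are simulated.

```python
import numpy as np


def stage1_grid_groups(summaries, tol):
    """Stage 1 stand-in: subjects in the same grid cell of width `tol` share a group.

    Within a cell, any two subjects differ by less than `tol` in every coordinate.
    """
    cells = np.floor(np.asarray(summaries, dtype=float) / tol).astype(int)
    # Each distinct cell (row) becomes one group label 0..G-1.
    _, labels = np.unique(cells, axis=0, return_inverse=True)
    return labels.ravel()


def stick_breaking_weights(alpha, truncation, rng):
    """Stage 2 stand-in: weights of a truncated stick-breaking draw with precision `alpha`."""
    betas = rng.beta(1.0, alpha, size=truncation)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


rng = np.random.default_rng(0)
summaries = rng.normal(size=(1000, 2))           # simulated subject summaries (N = 1000)
labels = stage1_grid_groups(summaries, tol=0.5)  # G << N groups
n_groups = int(labels.max()) + 1

# Prior draw only: each stage-1 group is assigned to one of the mixture atoms,
# so groups can be clustered further, as the abstract describes for stage 2.
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
cluster_of_group = rng.choice(len(weights), size=n_groups, p=weights)
```

In a real analysis the tolerance would come from subject matter experts, and the second stage would be fit with a sampler for the Dirichlet process mixture instead of a single prior draw.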

Similar resources

Empirical Bayes fitting of semiparametric random effects models to large data sets

For large data sets, it can be difficult or impossible to fit models with random effects using standard algorithms due to convergence or memory problems. In addition, it would be advantageous to use the abundant information to relax assumptions, such as normality of random effects. Motivated by maternal smoking and childhood growth data from the Collaborative Perinatal Project (CPP), we propose...

Propagation Models and Fitting Them for the Boolean Random Sets

In order to study the relationship between random Boolean sets and some explanatory variables, this paper introduces a Propagation model. This model can be applied when corresponding Poisson process of the Boolean model is related to explanatory variables and the random grains are not affected by these variables. An approximation for the likelihood is used to find pseudo-maximum likelihood esti...

SUGI 28: Smoothing with SAS(r) PROC MIXED

Mixed models are an extension of regression models that allows for incorporation of random effects. The application of mixed-effects models to practical data analysis has greatly expanded with consequent development of theory and computer software. It also turns out that mixed models are closely related to smoothing. Nonparametric regression models, especially the general smoothing spline model...

frailtyEM: An R Package for Estimating Semiparametric Shared Frailty Models

When analyzing correlated time to event data, shared frailty (random effect) models are particularly attractive. However, the estimation of such models has proved challenging. In semiparametric models, this is further complicated by the presence of the nonparametric baseline hazard. Although recent years have seen an increased availability of software for fitting frailty models, most software p...

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Journal:
  • Biostatistics

Volume 8, Issue 4

Pages: -

Publication date: 2007